A Comparison of Depression and Mental Distress Indicators, Rhode Island Behavioral Risk Factor Surveillance System, 2006

Introduction
Depression is a public health concern that warrants accurate population estimates. The patient health questionnaire 8 (PHQ-8) offers high sensitivity and specificity for assessing depression but is time-consuming to administer, answer, and score. We sought to determine whether 1 of 3 simpler instruments, namely the shorter PHQ-2 or 2 single questions from the health-related quality of life (HRQOL) module of the Behavioral Risk Factor Surveillance System (BRFSS), could offer accuracy comparable to the PHQ-8.

Methods
We compared the depression and mental distress indicators of 2006 Rhode Island BRFSS data by using 4 types of analyses: 1) sensitivity and specificity estimates, 2) prevalence estimates, 3) multivariable logistic regression modeling of the relationship between each of the 4 indicators and 11 demographic and health risk variables, and 4) geographic distribution of prevalence.

Results
Compared with the PHQ-8, the 3 other measures have high levels of specificity but lower sensitivity. Depression prevalence estimates ranged from 8.6% to 10.3%. The adjusted odds ratios from logistic regression modeling were consistent. Each of the indicators was significantly associated with low income, being unable to work, current smoking, and having a disability.

Conclusion
The PHQ-8 indicator is the most sensitive and specific and can assess depression severity. The HRQOL and PHQ-2 indicators are adequate to obtain population prevalence estimates if questionnaire length is limited.


Introduction
Depression is a public health problem. In 2000, it was rated as the fourth leading cause of disease globally, accounting for 4.4% of total disability-adjusted life years (1)(2)(3). Depression affects approximately 14.8 million American adults, or 6.7% of the population (4), and is linked to risk behaviors such as smoking, alcohol use, and physical inactivity (5)(6)(7). It is also associated with chronic diseases such as diabetes, asthma, arthritis, cardiovascular disorders, and cancer (2,8). Depression is underdiagnosed and undertreated (1,2,6). Getting a reliable estimate of depression prevalence in the general population is important but presents challenges. Several measures exist for detecting depression in clinical settings (9), and multiple self-reported questions used for clinical diagnosis have been adapted for use on population-based surveys (6,10). The challenge is finding a simple means to estimate depression prevalence in the general population.
Centers for Disease Control and Prevention • www.cdc.gov/pcd/issues/2011/mar/10_0097.htm The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention.
The Behavioral Risk Factor Surveillance System (BRFSS) is a telephone survey administered in all 50 states, the District of Columbia, Puerto Rico, the US Virgin Islands, and Guam with funding and specifications from the Centers for Disease Control and Prevention (CDC) (11). The BRFSS monitors the prevalence of behavioral risks for the leading causes of disease and death among adults in the United States (11). A 9-question health-related quality of life (HRQOL) module has been available since 1995 (12). In 2006, a 10-question depression and anxiety (D&A) module was also made available to states. Rhode Island was the only state to include both modules on its 2006 BRFSS questionnaire.
We compared depression and mental distress estimates from the HRQOL and D&A modules using data from Rhode Island's 2006 BRFSS. Two of the 5 items that are related to mental health in the HRQOL module were used, 1 for sad/blue/depressed and 1 for frequent mental distress. Two measures from the D&A module were used. The D&A module includes the patient health questionnaire 8 (PHQ-8), which is used to create a 5-point scale for depression severity based on an algorithm using responses to 8 questions. Severity scores can be grouped to create a dichotomous variable for current depression (6,9). The first 2 questions of the PHQ-8, called the PHQ-2, are also used to provide a simple measure of current depression (9). Our hypothesis was that either of the HRQOL questions or the PHQ-2 can serve as a proxy for the PHQ-8 on the BRFSS. The objective of this study was to assess whether 1 or 2 questions on depression and mental distress can yield prevalence estimates of depression comparable to those from the PHQ-8, which has a high degree of sensitivity and specificity.

Study design
We

Depression and mental distress indicators
Two depression and mental distress questions are on the HRQOL module. They asked respondents to estimate how many days in the past 30 days they experienced the following: "felt sad, blue, or depressed" (14 or more days = frequent depressive symptoms), and "mental health, which includes stress, depression, and problems with emotions, was not good" (14 or more days = frequent mental distress). The authors selected the 14-day minimum period because clinicians and clinical researchers often use this period as a marker for clinical depression disorders. In addition, most of the publications we reviewed that use the BRFSS HRQOL indicators use the cutoff of 14 or more days (14)(15)(16)(17)(18)(19)(20)(21)(22)(23). Adopting this precedent ensured comparability with other studies.
The PHQ-8 contains 8 of the 9 criteria for diagnosis of major depression as defined in the fourth edition of the Diagnostic and Statistical Manual of Mental Disorders. These questions ask the respondent to indicate how many days each of the following has occurred in the past 2 weeks: 1) had little interest or pleasure in doing things; 2) felt down, depressed, or hopeless; 3) had trouble falling asleep or staying asleep or sleeping too much; 4) felt tired or had little energy; 5) had a poor appetite or ate too much; 6) felt bad about yourself, or felt that you were a failure or had let yourself or your family down; 7) had trouble concentrating on things, such as reading the newspaper or watching television; 8) moved or spoke so slowly that other people could have noticed, or were so fidgety or restless that you were moving around a lot more than usual. The number of days for each question is converted to points (0-1 day = 0 points; 2-6 days = 1 point; 7-11 days = 2 points; and 12-14 days = 3 points), and the points are totaled for the 8 questions to determine a depressive symptoms severity score (6,9). If a response to any of the 8 questions was missing, a score was not calculated. Five severity categories are defined: no, mild, moderate, moderately severe, and severe depression. For the dichotomous variable, a score of 0 to 9 points, which covers no and mild depression, was defined as no depression, while a score of 10 to 24 points, which covers the other 3 categories, was defined as current depression (6,9,24). The PHQ-2 consists of the first 2 questions of the PHQ-8, which inquire about depressed mood and anhedonia. A score of 0 to 2 points is defined as no depression; a score of 3 to 6 points is defined as current depression (9). If a response to either of the 2 questions was missing, a score was not calculated.
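The scoring rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' SAS code; the function names are hypothetical, and items are passed in questionnaire order with `None` standing for a missing response.

```python
from typing import Optional, Sequence


def days_to_points(days: int) -> int:
    """Convert days in the past 2 weeks (0-14) to item points (0-3)."""
    if not 0 <= days <= 14:
        raise ValueError("days must be between 0 and 14")
    if days <= 1:
        return 0
    if days <= 6:
        return 1
    if days <= 11:
        return 2
    return 3


def phq_score(days_per_item: Sequence[Optional[int]]) -> Optional[int]:
    """Total the item points; no score is calculated if any item is missing."""
    if any(d is None for d in days_per_item):
        return None
    return sum(days_to_points(d) for d in days_per_item)


def phq8_current_depression(days_per_item: Sequence[Optional[int]]) -> Optional[bool]:
    """Dichotomous PHQ-8 indicator: 10 to 24 points = current depression."""
    if len(days_per_item) != 8:
        raise ValueError("the PHQ-8 has exactly 8 items")
    score = phq_score(days_per_item)
    return None if score is None else score >= 10


def phq2_current_depression(days_per_item: Sequence[Optional[int]]) -> Optional[bool]:
    """Dichotomous PHQ-2 indicator: 3 to 6 points = current depression."""
    if len(days_per_item) != 2:
        raise ValueError("the PHQ-2 has exactly 2 items")
    score = phq_score(days_per_item)
    return None if score is None else score >= 3


# Hypothetical respondent: days reported for each of the 8 items.
respondent = [7, 7, 7, 7, 2, 2, 0, 0]
print(phq_score(respondent), phq8_current_depression(respondent))
print(phq2_current_depression(respondent[:2]))
```

Returning `None` rather than a partial total mirrors the rule that a score was not calculated when any contributing response was missing.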
The proportion of records with missing values for the 2 single

Risk factors, health conditions, and demographics
For the analysis, we chose 3 health risk behaviors: current smoking, chronic alcohol use, and no leisure-time physical activity (PA); 4 health conditions: asthma, diabetes, obesity, and physical disability; and 4 demographic measures: age, sex, income, and employment status. We selected these risk and demographic factors based on our earlier work (25). We dichotomized some covariates for the analysis (ie, sex, current smoking, alcohol use, PA, asthma, diabetes, obesity, and disability), and the other covariates had multiple categories (ie, age, income, and employment status). The definitions of the 11 covariates are available in our previous article (25).

Analysis
Multiple imputation has been extensively applied to account for missing data in survey samples (26,27). To maintain maximal sample size and retain all valid data, we imputed missing data for all variables by using multiple imputation. In our study, depending on the analytical model, 24% to 30% of the 4,515 records in our data set had missing data for 1 or more of the 11 predictor or 4 outcome variables. Therefore, to retain all records, we imputed missing values for age, sex, race, marital status, education, employment status, health insurance, smoking, drinking, PA, asthma, diabetes, obesity, disability, square root-transformed income, the 5 mental health items in the HRQOL module, and the 10 D&A items. Analyzing the data without imputation did not change our conclusions (25).
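The text does not state how estimates were combined across the imputed data sets. A common approach is Rubin's rules, sketched below purely as an assumption about what such pooling could look like; the function name and the input numbers are hypothetical and are not study results.

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(estimates: Sequence[float],
               variances: Sequence[float]) -> Tuple[float, float]:
    """Pool one parameter across m imputed data sets with Rubin's rules.

    estimates: the point estimate from each imputed data set
    variances: the squared standard error from each imputed data set
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need matching estimates and variances from >= 2 imputations")
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Hypothetical log-odds estimates and variances from 3 imputed data sets.
pooled, total_var = pool_rubin([0.5, 0.6, 0.7], [0.04, 0.04, 0.04])
print(pooled, total_var ** 0.5)  # pooled estimate and its standard error
```

The between-imputation term widens the standard error to reflect uncertainty about the imputed values, which a single filled-in data set would understate.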
Results for these different indicators were compared in 4 ways. First, using the PHQ-8 indicator as the standard, we compared the sensitivity and specificity of the 3 simpler measures. Second, we compared prevalence estimates generated by the 4 measures. Third, using multivariable logistic regression, we compared the relationship between each of the indicators and 11 demographic and health risk variables. Finally, we compared the geographic distribution of prevalence for 2 of the 4 indicators using geographic information system (GIS) mapping.
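As a sketch of the first comparison, the test characteristics of a short indicator against the PHQ-8 standard follow from a 2 x 2 cross-classification. The names below are hypothetical and the counts are made up for illustration; they are not the study's data. Counts are typed as floats so that survey-weighted totals could be passed in, which is an assumption about usage rather than a description of the authors' procedure.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class TwoByTwo:
    """Cross-classification of a short indicator against the PHQ-8 standard."""
    tp: float  # indicator positive, PHQ-8 positive
    fp: float  # indicator positive, PHQ-8 negative
    fn: float  # indicator negative, PHQ-8 positive
    tn: float  # indicator negative, PHQ-8 negative


def diagnostic_metrics(t: TwoByTwo) -> Dict[str, float]:
    """Sensitivity, specificity, and predictive values as proportions."""
    return {
        "sensitivity": t.tp / (t.tp + t.fn),
        "specificity": t.tn / (t.tn + t.fp),
        "ppv": t.tp / (t.tp + t.fp),
        "npv": t.tn / (t.tn + t.fn),
    }


# Illustrative (made-up) counts only.
table = TwoByTwo(tp=60, fp=50, fn=40, tn=850)
for name, value in diagnostic_metrics(table).items():
    print(f"{name}: {value:.1%}")
```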
SAS version 9.1 (SAS Institute, Inc, Cary, North Carolina) was used for all analyses because it can adjust for the BRFSS complex sampling design. We calculated the sensitivity and specificity of the 2 HRQOL indicators and the PHQ-2 indicator, compared with the PHQ-8. Four logistic regression models were used to calculate adjusted odds ratios (AORs) and 95% confidence intervals (CIs) to assess the effect of each of the 11 risk factors for each of the depression indicators. All statistical inferences were based on a significance level of P (2-sided) < .05 calculated by using the Wald χ² test. The results of analyses for each of the indicators were compared with one another.
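To make the relationship between a logistic regression coefficient, its AOR, the 95% CI, and the Wald test concrete, the following sketch converts a coefficient and its standard error into those quantities. It assumes the standard error has already been estimated by survey software that accounts for the complex sampling design; the function names and the example inputs are hypothetical.

```python
import math
from typing import Tuple

Z_95 = 1.959964  # standard normal quantile for a 2-sided 95% interval


def odds_ratio_with_ci(beta: float, se: float) -> Tuple[float, float, float]:
    """Return (odds ratio, lower 95% limit, upper 95% limit)."""
    return (
        math.exp(beta),
        math.exp(beta - Z_95 * se),
        math.exp(beta + Z_95 * se),
    )


def wald_p_value(beta: float, se: float) -> float:
    """2-sided P value for the 1-degree-of-freedom Wald chi-square (beta/se)^2."""
    z = abs(beta / se)
    return math.erfc(z / math.sqrt(2))


# Hypothetical coefficient for one risk factor in one of the 4 models.
beta, se = math.log(2.0), 0.2
aor, lower, upper = odds_ratio_with_ci(beta, se)
print(f"AOR {aor:.2f} (95% CI, {lower:.2f}-{upper:.2f}), P = {wald_p_value(beta, se):.4f}")
```

An interval that excludes 1 corresponds to a Wald P value below .05, which is why the two criteria for significance agree.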
ArcGIS 9.0 (Environmental Systems Research Institute, Inc, Redlands, California) was used to map depression prevalence estimates for cities and towns by zip code. We used the Jenks optimization method (also called the Jenks natural breaks classification method) to create the value ranges of sad/blue/depressed and PHQ-8 current depression depicted on the GIS maps.
We chose natural breaks rather than the defined interval classification used by others (28,29). Defined interval classification allows the analyst to specify an interval by which to equally divide a range of attribute values. We wanted to judge whether the distributions depicted in the GIS maps are consistent with one another. With defined interval classification, similar features can be placed in adjacent classes, or features with widely different values can be put in the same class, so the resulting maps can be misleading.
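The difference between the 2 classification schemes can be illustrated with a small sketch. The natural breaks function below finds, by brute force, the grouping of sorted values that minimizes the total within-class sum of squared deviations, which is the criterion the Jenks method optimizes. It is not ArcGIS's implementation, it is practical only for short lists, and the prevalence values are hypothetical.

```python
from bisect import bisect_right
from itertools import combinations
from typing import List, Sequence


def equal_interval_breaks(values: Sequence[float], k: int) -> List[float]:
    """Upper bounds of the first k - 1 classes when the range is split evenly."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k)]


def _sum_sq_dev(xs: Sequence[float]) -> float:
    """Sum of squared deviations from the class mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)


def natural_breaks(values: Sequence[float], k: int) -> List[List[float]]:
    """Split sorted values into k contiguous classes minimizing within-class variation."""
    xs = sorted(values)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of values")
    best_cost, best_classes = None, None
    for cuts in combinations(range(1, n), k - 1):  # every way to place k - 1 cut points
        bounds = (0, *cuts, n)
        classes = [xs[a:b] for a, b in zip(bounds, bounds[1:])]
        cost = sum(_sum_sq_dev(c) for c in classes)
        if best_cost is None or cost < best_cost:
            best_cost, best_classes = cost, classes
    return best_classes


# Hypothetical prevalence estimates (%) for 7 localities.
prevalence = [5.0, 5.2, 5.5, 7.7, 7.9, 13.0, 13.4]
breaks = equal_interval_breaks(prevalence, 3)
print("equal-interval class of each value:", [bisect_right(breaks, v) for v in prevalence])
print("natural-breaks classes:", natural_breaks(prevalence, 3))
```

With these values, the evenly spaced boundary falls between 7.7 and 7.9 and separates two nearly identical localities, whereas the natural breaks grouping keeps them together.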

Results
Compared with the PHQ-8, each of the 3 indicators had a high level of specificity, ranging from 94.4% to 96.4%; the PHQ-2 had a slightly higher negative predictive value than the other 2 indicators (Table 1). The sensitivity of the 3 indicators was weaker, ranging from 59.4% to 66.8%. The PHQ-2 had higher sensitivity than HRQOL indicators. The positive predictive values for the PHQ-2 and the sad/blue/depressed indicator were almost identical; the "frequent mental distress" indicator had a lower positive predictive value.
In the HRQOL module, the prevalence of frequent mental distress among Rhode Island adults was 10.3% and of sad/blue/depressed was 8.9% (Table 2). In the PHQ-2 and PHQ-8, the prevalence of current depression was 9.8% and 8.6%, respectively. Results of tests for significance for sad/blue/depressed and for PHQ-8 were consistent for the 11 demographic and risk variables with the exception of
The prevalence of sad/blue/depressed in Providence (excluding the affluent east side), West Warwick, and Warwick ranged from 11.7% to 13.6%, higher than the rest of the state (Figure 1). The prevalence in Woonsocket, Central Falls, Pawtucket, North Providence, Johnston, Rumford, East Providence, Cranston, and Riverside ranged from 7.3% to 11.6%. These areas with higher depression rates include the more urban areas of the state, which have a higher proportion of low-income households than do the suburban and rural areas.
The prevalence of current depression in Providence (excluding the affluent east side), Central Falls, Pawtucket, Warwick, and West Warwick ranged from 11.3% to 15.5% (Figure 2). The first 3 cities are urban with a high proportion of low-income and minority residents. The prevalence in Woonsocket, East Providence, Coventry, Greene, West Greenwich, and East Greenwich ranged from 7.8% to 11.2%. Other than East Providence, these are suburban areas. The remainder of the state, with the lowest rates for current depression, is largely suburban and rural.

Discussion
Our analysis showed that any of the 3 shorter items provides results comparable with those of the PHQ-8 in estimating overall prevalence of depression and mental distress, identifying high-risk populations, and identifying significant associations with risk variables. We recommend use of any of the 3 shorter items as a proxy on the BRFSS or similar population-based surveys to obtain a population estimate of depression prevalence. Any one of the 3 is adequate for use in descriptive analyses of population data. They provide an efficient means of assessing depression prevalence in adult populations when survey efficiency precludes use of the longer PHQ-8. However, the PHQ-8 should be preferred for surveys that require a high level of sensitivity as well as specificity, reliable assignment of depression severity status to specific respondents, or assessment over time of population changes in depression severity.
The strengths of the PHQ-8 are its high degree of both sensitivity (88%) and specificity (88%) for major depression (6,9), its adequacy for diagnosis, its ability to assess severity of depression, and its validation in the general population (6,9,10). The weaknesses of the PHQ-8 are its length, which makes it time-consuming to administer and answer, and its complex scoring algorithm. Although the simple question measures of depression are inadequate for diagnosis (9), they are useful for screening for depression. Sad/blue/depressed and frequent mental distress are single questions that are very easy to answer and administer, with few training requirements. The PHQ-2 is easy to answer and administer, and the scoring algorithm is simple (9).
In terms of test characteristics and test performance, the PHQ-2 was more sensitive than the 2 HRQOL indicators. Frequent mental distress was more concordant with the PHQ-8 than was the PHQ-2 in distinguishing demographic groups and risk factors with higher prevalence of depression. Based on the area under the receiver operating characteristic curve (AUC), sad/blue/depressed performed better than frequent mental distress and the PHQ-2. The proportion of records with missing values was lower for the 2 single HRQOL items than for the PHQ-2 or the PHQ-8. No measure was better than the others in all respects.
Both the sad/blue/depressed and the PHQ-8 maps showed that the prevalence of depression was highest in the core urban areas of the state and lowest in the more suburban and rural areas. The differences between the 2 distributions may reflect the greater power of the PHQ-8 to discriminate between cities and towns. This discriminatory power may also be reflected in the wider range of values resulting from the PHQ-8 measure than from the sad/blue/depressed measure.
Rhode Island is a small state, so the study population is homogeneous compared with populations of other states. Our analyses went beyond simple comparisons of test characteristics and test performance to include distinguishing levels of demographic characteristics and risk factors as well as within-state geographic comparisons.
Some limitations of this analysis should be noted. First, the HRQOL indicators are based on a 30-day recall period, while the PHQ-2 and PHQ-8 are based on a 14-day recall period. We have no way to assess the effect of this difference in recall periods. Second, both sets of questions (the HRQOL and the PHQ-8) were asked in the same interview session (ie, they were not context-independent). We have no way to assess the effect of this. Finally, because the calculation of PHQ-8 and PHQ-2 scores requires responses to all questions used in calculating scores, it was necessary to impute missing values. In the future, we need to vary the cut points of the 2 single-question screeners and the PHQ-2 to optimize their performance against the PHQ-8 (30).
We conclude that, for Rhode Island, any of the 3 short screeners is sufficiently specific and sensitive to provide population prevalence estimates of depression and can be used for descriptive analyses of our population survey data. Mapping of these depression estimates also indicates localities where the need for mental health services is greatest. To validate the generalizability of our findings, it will be important to replicate them in other states' BRFSS surveys.

Acknowledgments
We thank our colleagues in the Rhode

Tables

Table 1. Comparison of Test Characteristics and Test Performance of 3 Indicators Against PHQ-8, Rhode Island,

Abbreviations: HRQOL, health-related quality of life; D&A, depression and anxiety; AOR, adjusted odds ratio; CI, confidence interval; PHQ, patient health questionnaire.
a Data are reported as AORs adjusted for all other variables in the model after multiple imputation to account for missing data in survey samples (26,27). AOR is considered significant if its confidence interval does not include 1.
b Defined as reporting this indicator for 1 or more days per month. See Methods section for complete variable description.
c Obesity was defined as body mass index ≥30 kg/m².